Appl. No. 10/694,367 

Amdt. dated October 24, 2008 

Reply to Office Action of July 24, 2008 

Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in 
the application: 

Listing of Claims: 

1. (Currently amended) A processor-based method, comprising: 

selecting a set number of functions correlating variable parameters of a 
dataset; and 

clustering the dataset by iteratively applying a regression algorithm and a 
K-Harmonic Means performance function on the set number of 
functions to determine a pattern in said dataset; 

wherein said clustering comprises determining distances between data 
points of the dataset and values correlated with the set number of 
functions, regressing the set number of functions using data point 
probability and weighting factors associated with the determined 
distances, calculating a difference of harmonic averages for the 
distances determined prior to and subsequent to said regressing, 
and repeating said regressing, determining and calculating upon 
determining the difference of harmonic averages is greater than a 
predetermined value. 
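
For illustration only, and without limiting or forming part of the claim language, the iteration recited in claim 1 might be sketched as follows. The sketch assumes straight-line functions y = a*x + b, absolute residuals as the distances, and membership and weighting formulas borrowed from the general K-Harmonic Means literature; the function names, the exponent p, the tolerance tol and the iteration cap are all hypothetical choices, not taken from this application.

```python
def distances(points, funcs, eps=1e-9):
    """Absolute residual |a*x + b - y| of every data point to every line (a, b)."""
    return [[max(abs(a * x + b - y), eps) for (a, b) in funcs] for (x, y) in points]


def harmonic_performance(dist, p=2):
    """Sum over data points of the harmonic average of d**p across the K functions."""
    return sum(len(row) / sum(d ** -p for d in row) for row in dist)


def weighted_line_fit(points, weights):
    """Closed-form weighted least-squares fit of y = a*x + b."""
    sw = sum(weights)
    mx = sum(w * x for w, (x, _) in zip(weights, points)) / sw
    my = sum(w * y for w, (_, y) in zip(weights, points)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, (x, _) in zip(weights, points))
    sxy = sum(w * (x - mx) * (y - my) for w, (x, y) in zip(weights, points))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def regression_cluster(points, funcs, p=2, tol=1e-6, max_iter=200):
    """Alternate weighted regression and distance evaluation until the change
    in the harmonic-average performance is no longer greater than tol."""
    dist = distances(points, funcs)
    perf = harmonic_performance(dist, p)
    for _ in range(max_iter):
        new_funcs = []
        for k in range(len(funcs)):
            weights = []
            for row in dist:
                s_pq = sum(d ** -(p + 2) for d in row)
                s_p = sum(d ** -p for d in row)
                prob = row[k] ** -(p + 2) / s_pq   # assumed membership probability
                factor = s_pq / s_p ** 2           # assumed weighting factor
                weights.append(prob * factor)
            new_funcs.append(weighted_line_fit(points, weights))
        funcs = new_funcs
        dist = distances(points, funcs)            # distances after regressing
        new_perf = harmonic_performance(dist, p)
        diff = abs(perf - new_perf)                # difference of harmonic averages
        perf = new_perf
        if diff <= tol:                            # repeat only while diff > tol
            break
    return funcs, perf
```

A caller would supply the data points and an initial guess for each of the set number of functions, for example regression_cluster(points, [(1.5, 0.0), (-0.5, 4.0)]) for two lines.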

2. (Canceled). 

3. (Original) The processor-based method of claim 2, wherein said 
determining the distances comprises determining distances from each datapoint 
of the dataset to values within each function of the set number of functions. 

4. (Original) The processor-based method of claim 2, wherein said selecting 
and said clustering are conducted for a plurality of datasets each from a different 
data source. 

5. (Original) The processor-based method of claim 4, wherein said selecting 
and said clustering are conducted in parallel for each of the plurality of datasets. 

6. (Original) The processor-based method of claim 4, further comprising 
determining a common coefficient vector to compensate for variations between 
similar sets of functions within the different data sources. 

7. (Original) The processor-based method of claim 6, wherein said 
determining the common coefficient vector comprises: 

developing matrices from the dataset datapoints and the probability and 
weighting factors for each of the datasets prior to said reiterating; 
and 

determining the common coefficient vector from a composite of the 
developed matrices. 

8. (Original) The processor-based method of claim 7, further comprising 
multiplying the similar sets of functions within the different data sources by the 
common coefficient vector. 

9. (Previously presented) A storage medium comprising program 
instructions executable by a processor for: 

selecting a set number of functions correlating variable parameters of a 
dataset; 

determining distances between datapoints of the dataset and values 
correlated with the set number of functions; 

calculating harmonic averages of the distances; 

regressing the set number of functions using datapoint probability and 
weighting factors associated with the determined distances; 

repeating said determining and calculating for the regressed set of 
functions; 

computing a change in harmonic averages for the set number of functions 
prior to and subsequent to said regressing; and 

reiterating said regressing, repeating and computing upon determining the 
change in harmonic averages is greater than a predetermined value 
to thereby determine a pattern in said dataset. 

10. (Original) The storage medium of claim 9, wherein the program 
instructions are executable using a processor for computing the datapoint 
probability and weighting factors. 

11. (Original) The storage medium of claim 9, wherein the program 
instructions are executable using a processor for developing matrices from the 
dataset datapoints and the probability and weighting factors prior to said 
reiterating. 

12. (Original) The storage medium of claim 11, wherein the program 
instructions are executable using a processor for amassing matrices developed 
from a plurality of datasets each from a different data source. 

13. (Original) The storage medium of claim 11, wherein the program 
instructions are executable using a processor for determining a common 
coefficient vector from the composite of matrices. 

14. (Original) The method of claim 13, wherein the program instructions are 
executable using a processor for multiplying similar sets of functions within the 
different data sources by the common coefficient vector. 

15. (Currently amended) A system, comprising: 
an input port configured to receive data; and 
a processor configured to: 

regress functions correlating variable parameters of a set of the 
data; 

cluster the functions using a K-Harmonic Mean performance 
function; and 

repeat said regress and cluster sequentially to thereby determine a 
pattern in said set of data; 

wherein the processor clusters the functions by determining distances 
between data points of the dataset and values correlated with a set 
number of functions, regressing the set number of functions using 
data point probability and weighting factors associated with the 
determined distances, calculating a difference of harmonic 
averages for the distances determined prior to and subsequent to 
said regressing. 

16. (Original) The system of claim 15, wherein the processor is arranged 
within one of a plurality of data sources each comprising a processor configured 
to: 

regress the functions on a dataset of the respective data source; 

cluster the functions using a K-Harmonic Mean performance function; and 

repeat said regress and cluster sequentially. 

17. (Original) The system of claim 15, further comprising a central station 
coupled to the plurality of data sources, wherein the central station comprises a 
processor configured to compute common coefficient vectors which compensate 
for variations between the regressively clustered functions representing the 
datasets, and wherein each of the processors of the data sources is configured to 
alter the functions by the common coefficient vectors. 

18. (Currently amended) A system, comprising: 
a plurality of data sources; and 

a means for regressively clustering datapoints from the plurality of data 
sources without transferring data between the plurality of data 
sources to thereby determine a pattern in data contained in said 
data sources and for applying a K-Harmonic Means performance 
function on the data; 

wherein the means for regressively clustering the datasets comprises a 
storage medium with program instructions executable using a 
processor for selecting a set number of functions correlating 
variable parameters of a dataset, determining distances between 
data points of the dataset and values correlated with the set number 
of functions, regressing the set number of functions using data point 
probability and weighting factors associated with the determined 
distances, calculating a difference of harmonic averages for the 
distances determined prior to and subsequent to said regressing; 
and reiterating said regressing, determining and calculating upon 
determining the difference of harmonic averages is less than a 
predetermined value. 

19.-21. (Canceled). 

22. (Original) The system of claim 18, further comprising a central station 
communicably coupled to the plurality of data sources, wherein the means is 
further for: 

collecting dataset information at the central station from the plurality of 
data sources; 

determining a common coefficient vector from the collected dataset 
information; and 

altering datasets within the plurality of data sources by the common 
coefficient vector. 

23. (Canceled). 

24. (Currently amended) A system, comprising: 

a plurality of data sources each having a processor configured to access 
datapoints within the respective data source; and 

a central station coupled to the plurality of data sources and comprising a 
processor, wherein the processors of the central station and 
plurality of data sources are collectively configured to mine the 
datapoints of the data sources as a whole without transferring all of 
the datapoints between the data sources and the central station to 
thereby determine a pattern in datapoints contained in said data 
sources; 

wherein the each of the processors within the plurality of data sources is 
configured to regressively cluster a dataset within the respective 
data source; 

wherein the processor within the central station is configured to: 

collect information pertaining to the regressively clustered datasets; 

based upon the collected information, calculate common coefficient 
vectors which balance variations between functions 
correlating similar variable parameters of the regressively 
clustered datasets; 

compute a residual error from the common coefficient vectors; 

propagate the common coefficient vectors to the data sources upon 
computing a residual error value greater than a 
predetermined value; and 

send a message to the data sources to terminate the regression 
clustering of the datasets upon computing a residual error 
value less than a predetermined value. 
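
For illustration only, and without limiting or forming part of the claim language, the central-station decision recited in claim 24 might be sketched as follows. The sketch assumes each data source reports a hypothetical summary (A, b, syy) for a straight-line model with basis (x, 1), where A sums w*v*v^T, b sums w*y*v and syy sums w*y*y over that source's weighted datapoints; the weighted squared residual is one possible reading of "residual error," and the message labels are invented for the example.

```python
def residual_error(coeffs, summaries):
    """Weighted squared residual of the line y = s*x + i, computed from
    per-source summaries only (no individual datapoints are needed)."""
    s, i = coeffs
    total = 0.0
    for A, b, syy in summaries:
        # c^T A c for c = (s, i), with A symmetric 2x2
        quad = s * s * A[0][0] + 2 * s * i * A[0][1] + i * i * A[1][1]
        total += syy - 2 * (s * b[0] + i * b[1]) + quad
    return total


def central_station_decision(coeffs, summaries, threshold):
    """Return the message the central station would send to the data sources."""
    err = residual_error(coeffs, summaries)
    if err > threshold:
        # residual error greater than the predetermined value: keep iterating
        return ("PROPAGATE", coeffs, err)
    # otherwise tell the data sources to stop regression clustering
    # (the equal-to-threshold case is treated as termination by assumption)
    return ("TERMINATE", None, err)
```

With a single source holding the points (0, 1) and (1, 3) at unit weight, the summary is A = ((1, 1), (1, 2)), b = (3, 4), syy = 10, and the coefficients (2, 1) give a residual error of zero.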

25.-27. (Canceled). 

28. (Currently amended) A processor-based method for mining data, 
comprising: 

independently applying a regression clustering algorithm to a plurality of 
distributed datasets by determining distances between data points 
of each dataset and values correlated with a set number of 
functions, regressing the set number of functions using data point 
probability and weighting factors associated with the determined 
distances, calculating a difference of harmonic averages for the 
distances determined prior to and subsequent to application of said 
regression algorithm, and repeating said regressing, determining 
and calculating upon determining the difference of harmonic 
averages is greater than a predetermined value; 

developing matrices from probability and weighting factors computed from 
the regression clustering algorithm, wherein the matrices 
individually represent the distributed datasets without including all 
datapoints within the datasets; 

determining global coefficient vectors from a composite of the matrices; 
and 

multiplying functions correlating similar variable parameters of the 
distributed datasets by the global coefficient vectors to thereby 
determine a pattern in said datasets. 
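
For illustration only, and without limiting or forming part of the claim language, the matrix-based combination recited in claim 28 might be sketched as follows. The sketch assumes a straight-line model with basis (x, 1) and per-datapoint weights already produced by a regression clustering pass (probability multiplied by weighting factor); the function names and the 2x2 normal-equation form are hypothetical choices for the example.

```python
def local_matrices(points, weights):
    """Summarize one distributed dataset as A = sum w*v*v^T and b = sum w*y*v,
    with v = (x, 1). The summary size does not grow with the number of datapoints."""
    sxx = sx = s1 = sxy = sy = 0.0
    for (x, y), w in zip(points, weights):
        sxx += w * x * x
        sx += w * x
        s1 += w
        sxy += w * x * y
        sy += w * y
    return ((sxx, sx), (sx, s1)), (sxy, sy)


def global_coefficients(summaries):
    """Solve the composite normal equations (sum of A) c = (sum of b) for c."""
    a11 = sum(A[0][0] for A, _ in summaries)
    a12 = sum(A[0][1] for A, _ in summaries)
    a22 = sum(A[1][1] for A, _ in summaries)
    b1 = sum(b[0] for _, b in summaries)
    b2 = sum(b[1] for _, b in summaries)
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("composite matrix is singular")
    slope = (a22 * b1 - a12 * b2) / det
    intercept = (a11 * b2 - a12 * b1) / det
    return slope, intercept


def apply_coefficients(coeffs, x):
    """Evaluate the basis (x, 1) against the global coefficient vector."""
    slope, intercept = coeffs
    return slope * x + intercept
```

Each data source would call local_matrices on its own datapoints and send only the resulting summary; the composite is then formed by global_coefficients without any source exposing its full dataset.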

29. (Original) The processor-based method of claim 28, further comprising 
repeating said independently applying, said developing, said determining and 
said multiplying. 

30. (Original) The processor-based method of claim 28, further comprising 
calculating a residue error associated with the global coefficients prior to said 
multiplying. 